Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Exact Match option to Suggest #15410

Closed
wants to merge 4 commits into from

Conversation

ricardocerq
Copy link

Solves #11579.

Added option to term suggest that returns the input term if it exists. Turned off by default so it doesn't break normal behaviour.

The example query given in #11579 would now be:

"Suggest": {
      "term": {
        "field": "word",
        "suggest_mode": "popular",
        "size": 2,
        "prefix_len": 1,
        "analyzer": "default",
        "exact_matching": true
      },
      "text": "Software"
    }

And the result:

"Suggest": [
      {
         "text": "software",
         "offset": 0,
         "length": 8,
         "options": [
            {
               "text": "software",
               "score": 1,
               "freq": 1
            }
         ]
      }
   ]

The output now clearly indicates if the term is present. If it is, its score is always 1, representing an exact match, which differentiates this result from other suggestions.

@clintongormley clintongormley added >enhancement review :Search/Suggesters "Did you mean" and suggestions as you type labels Dec 14, 2015
@clintongormley
Copy link

@areek could you have a look at this please.
@ricardocerq you'll also need a test in the PR before it will be merged.

@@ -56,7 +59,10 @@ public TermSuggestion innerExecute(String name, TermSuggestionContext suggestion
);
Text key = new BytesText(new BytesArray(token.term.bytes()));
TermSuggestion.Entry resultEntry = new TermSuggestion.Entry(key, token.startOffset, token.endOffset - token.startOffset);
for (SuggestWord suggestWord : suggestedWords) {
if(suggestion.getDirectSpellCheckerSettings().exactMatch()){
addExactMatch( suggestion, indexReader, token, resultEntry);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove addExactMatch method altogether and just use the TermsEnum#seekExact to see if the term is present in the term dictionary:

final TermsEnum termsEnum = MultiFields.getTerms(indexReader, token.term.field()).iterator();
if (termsEnum.seekExact(token.term.bytes())) {
    Text word = new StringText(token.term.text());
    resultEntry.addOption(new TermSuggestion.Entry.Option(word, termsEnum.docFreq(), 1f));
}

@areek
Copy link
Contributor

areek commented Dec 14, 2015

Thanks @ricardocerq for the PR :). I left some comments. Would be awesome if we could have a test for the new setting too. You could have a look at SuggestSearchTests.java for some inspiration.

@ricardocerq
Copy link
Author

Thanks for the attention and the feedback! I'm making the necessary changes. :)

@clintongormley
Copy link

@areek could you review this again please?

@dakrone
Copy link
Member

dakrone commented Apr 6, 2016

pinging @areek again for review :)

@@ -56,7 +59,14 @@ public TermSuggestion innerExecute(String name, TermSuggestionContext suggestion
);
Text key = new BytesText(new BytesArray(token.term.bytes()));
TermSuggestion.Entry resultEntry = new TermSuggestion.Entry(key, token.startOffset, token.endOffset - token.startOffset);
for (SuggestWord suggestWord : suggestedWords) {
if (suggestion.getDirectSpellCheckerSettings().exactMatch()){
final TermsEnum termsEnum = MultiFields.getTerms(indexReader, token.term.field()).iterator();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation is off here

@areek
Copy link
Contributor

areek commented Apr 7, 2016

@ricardocerq sorry for the delay, this LGTM, would you mind updating the PR with master?

@dakrone
Copy link
Member

dakrone commented Sep 12, 2016

@ricardocerq are you able to update this by rebasing or merging master in so it can be merged?

@ricardocerq
Copy link
Author

Sorry about the delay :(. I was unable to retest the changes. If more changes are necessary, please let me know.

@worldsayshi
Copy link

@areek @clintongormley Can this be merged?

Copy link
Contributor

@s1monw s1monw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the merge wasn't really successful afaik. There are about 1400 lines of unnecessary code in this PR. Otherwise it looks ok but I can't merge it like this. I left comments inline

@@ -0,0 +1,301 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this entire class seems unused?

@@ -137,6 +139,14 @@ public void minDocFreq(float minDocFreq) {
this.minDocFreq = minDocFreq;
}

public boolean exactMatch() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have javadocs on this?

@@ -0,0 +1,1329 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a full copy of our tests that has been moved back to core into org.elasticsearch.search.suggest.SuggestSearchIT as far as I can tell only testExactMatch should be added?

@clintongormley
Copy link

@ricardocerq are you still interested in completing this PR?

@clintongormley
Copy link

Closing in favour of #22902

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement feedback_needed :Search/Suggesters "Did you mean" and suggestions as you type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants